[BenchGC] add tuner tools for benchgc #358

xurui1995 · 2024-09-24T07:09:37Z

Add tuner tools to benchgc to support auto-tuning.

WangJialei-A · 2024-09-24T07:23:56Z

test/benchgc/src/benchgc/tuner/op_config.py

+        default_blocks = [16, 32, 64, 128, 256, 512]
+        default_innermost_blocks = [16, 32]
+        self.field_candidates["M_threads"] = find_factors(self.num_threads)
+        self.field_candidates["K_threads"] = find_factors(self.num_threads)
+        self.field_candidates["N_threads"] = find_factors(self.num_threads)
+        self.field_candidates["M_block"] = [
+            block for block in default_blocks if self.M >= block
+        ]
+        self.field_candidates["K_block"] = [
+            block for block in default_blocks if self.K >= block
+        ]
+        self.field_candidates["N_block"] = [
+            block for block in default_blocks if self.N >= block
+        ]
+        self.field_candidates["innermostM_block"] = [
+            block for block in default_innermost_blocks if self.M >= block
+        ]
+        self.field_candidates["innermostK_block"] = [
+            block for block in default_innermost_blocks if self.K >= block
+        ]
+        self.field_candidates["innermostN_block"] = [
+            block for block in default_innermost_blocks if self.N >= block
+        ]


It would be better to expose the grid options on the command line, so that developers can control the search space themselves.
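
One way to expose the candidates on the command line is sketched below. This is only an illustration: the flag names (`--m_block`, `--innermost_m_block`) and the helper `filter_candidates` are hypothetical and not part of the PR. The defaults mirror the lists in the diff above.

```python
import argparse

DEFAULT_BLOCKS = [16, 32, 64, 128, 256, 512]
DEFAULT_INNERMOST_BLOCKS = [16, 32]


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical flags: each accepts a space-separated list of integers.
    parser = argparse.ArgumentParser(description="tuner grid options (sketch)")
    parser.add_argument("--m_block", type=int, nargs="+", default=DEFAULT_BLOCKS)
    parser.add_argument(
        "--innermost_m_block",
        type=int,
        nargs="+",
        default=DEFAULT_INNERMOST_BLOCKS,
    )
    return parser


def filter_candidates(candidates, dim):
    # Same rule as the diff: keep only blocks that fit in the dimension.
    return [block for block in candidates if dim >= block]


# Example: the user narrows M_block; 1024 is dropped because M is 128.
args = build_parser().parse_args(["--m_block", "32", "64", "1024"])
m_block_candidates = filter_candidates(args.m_block, dim=128)
innermost_m_candidates = filter_candidates(args.innermost_m_block, dim=128)
```

Flags that are not passed fall back to the current defaults, so existing invocations would keep the same search space.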

WangJialei-A · 2024-09-24T07:36:40Z

test/benchgc/src/benchgc/tuner/tuner.py

+    def save_status(self):
+        save_dict = {
+            "iter": self.iter,
+            "last_update_iter": self.last_update_iter,
+            "best": self.best,
+            "best_cost": self.best_cost,
+            "current_idx": self.current_idx,
+            "skipped_num": self.skipped_num,
+        }
+        with open(self.checkpoint, "w") as file:
+            json.dump(save_dict, file, indent=4)
+
+    def load_status(self):
+        print("continue tuning from checkpoint...")
+        with open(
+            self.checkpoint,
+            "r",
+        ) as file:
+            try:
+                data = json.load(file)
+                assert set(
+                    [
+                        "iter",
+                        "last_update_iter",
+                        "best",
+                        "best_cost",
+                        "current_idx",
+                        "skipped_num",
+                    ]
+                ) == set(data.keys())
+                self.iter = data["iter"]
+                self.last_update_iter = data["last_update_iter"]
+                self.best = data["best"]
+                self.best_cost = data["best_cost"]
+                self.current_idx = data["current_idx"]
+                self.skipped_num = data["skipped_num"]
+            except Exception as e:
+                print("load checkpoint failed", e)


Do we really need this feature? Is tuning a time-consuming job?
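
For context, the checkpoint round trip under discussion can be exercised on its own. The sketch below is not the PR's code: the free functions `save_status` and `load_status` and the temporary file path are assumptions made for illustration. It also replaces the `assert` in the diff with an explicit key check that reports failure to the caller.

```python
import json
import os
import tempfile

CHECKPOINT_KEYS = {
    "iter",
    "last_update_iter",
    "best",
    "best_cost",
    "current_idx",
    "skipped_num",
}


def save_status(path, status):
    # Write to a temporary file first, then rename, so an interrupted
    # run does not leave a half-written checkpoint behind.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as file:
        json.dump(status, file, indent=4)
    os.replace(tmp_path, path)


def load_status(path):
    # Return the saved dict, or None if the file is missing or malformed.
    try:
        with open(path, "r") as file:
            data = json.load(file)
    except (OSError, json.JSONDecodeError) as e:
        print("load checkpoint failed", e)
        return None
    if not isinstance(data, dict) or set(data.keys()) != CHECKPOINT_KEYS:
        print("load checkpoint failed: unexpected keys")
        return None
    return data


status = {
    "iter": 3,
    "last_update_iter": 2,
    "best": [1, 0, 2],
    "best_cost": 0.42,
    "current_idx": 3,
    "skipped_num": 0,
}
checkpoint = os.path.join(tempfile.mkdtemp(), "tuner.ckpt")
save_status(checkpoint, status)
restored = load_status(checkpoint)
```

Returning `None` on failure lets the caller decide whether to restart from scratch, rather than continuing with partially initialized state.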

WangJialei-A · 2024-09-25T07:33:25Z

@xurui1995
I would suggest using a mature tuning framework such as Optuna to handle this problem.
Please see the example here:
https://optuna.readthedocs.io/en/stable/reference/samplers/generated/optuna.samplers.GridSampler.html

ciyongch · 2024-09-27T01:29:39Z

> @xurui1995 I would suggest using a mature tuning framework such as Optuna to handle this problem. Please see the example here: https://optuna.readthedocs.io/en/stable/reference/samplers/generated/optuna.samplers.GridSampler.html

Using an existing auto-tuning framework seems like a good idea. Let's evaluate whether it meets our requirements for the tuning features, for example an arbitrary tuning space, checkpoint save and restore, early stop, and distributed tuning.
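
As a rough baseline for that evaluation, two of the listed features (an arbitrary tuning space and early stop) can be sketched in a few lines of plain Python. Everything here is hypothetical: `grid_search`, `toy_cost`, and the `patience` parameter are illustration names, and the cost function stands in for a real benchmark run.

```python
import itertools


def grid_search(space, cost_fn, patience=5):
    # Enumerate the Cartesian product of all candidate lists.
    names = list(space)
    best, best_cost, last_update = None, float("inf"), 0
    candidates = itertools.product(*(space[name] for name in names))
    for i, values in enumerate(candidates):
        config = dict(zip(names, values))
        cost = cost_fn(config)
        if cost < best_cost:
            best, best_cost, last_update = config, cost, i
        elif i - last_update >= patience:
            # Early stop: no improvement in the last `patience` trials.
            break
    return best, best_cost


def toy_cost(config):
    # Stand-in for a benchmark; minimal at M_block=64, N_block=32.
    return abs(config["M_block"] - 64) + abs(config["N_block"] - 32)


space = {"M_block": [16, 32, 64, 128], "N_block": [16, 32]}
best, best_cost = grid_search(space, toy_cost, patience=3)
```

A framework worth adopting should cover at least this much, plus the checkpointing and distributed execution that a hand-rolled loop makes awkward.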

yifeizh2 · 2024-09-30T06:04:57Z

test/benchgc/src/benchgc/tuner/op_config.py

+
+    def attach_to_ir(self, op: OpView):
+        attr_to_field = {
+            "Mthreads": self.M_threads,


Currently MatmulConfigAnalysis.cpp reads the named attribute MThreads, not Mthreads. Please align the naming convention here (and likewise for Kthreads and Nthreads).
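
One way to avoid this kind of mismatch is to derive the attribute names from the Python field names instead of typing both by hand. The helper `field_to_attr` below is hypothetical. The resulting names follow the convention discussed in this review (`MThreads`, `MBlock`, `innermostMBlock`) and should be checked against MatmulConfigAnalysis.cpp before relying on them.

```python
def field_to_attr(field):
    # "M_threads" -> "MThreads", "innermostM_block" -> "innermostMBlock"
    prefix, suffix = field.split("_", 1)
    return prefix + suffix.capitalize()


# Field names as they appear in the field_candidates diff of this PR.
fields = [
    "M_threads",
    "K_threads",
    "N_threads",
    "M_block",
    "K_block",
    "N_block",
    "innermostM_block",
    "innermostK_block",
    "innermostN_block",
]
attr_names = {field: field_to_attr(field) for field in fields}
```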

yifeizh2 · 2024-10-14T14:26:04Z

test/benchgc/src/benchgc/tuner/README.md

+        "MBlock": 128,
+        "KBlock": 64,
+        "NBlock": 16,
+        "innerMostMBlock": 32,


Typo: this should be innermost, with a lowercase "m", to match the matmul config.

yifeizh2 · 2024-10-14T16:15:48Z

test/benchgc/src/benchgc/tuner/op_config.py

+                self.innermost_k_block,
+                self.innermost_n_block,
+            ],
+            [self.m, self.k, self.n],


The order here should be m/n/k.

yifeizh2 · 2024-10-15T06:58:47Z

test/benchgc/src/benchgc/tuner/README.md

+## Options
+Since bench is also required within the tuner, the tuner also supports benchmarking options.
+Unlike bench mode, in tuner mode, a batch quantity of modules is generated each time, and The default values for warm-up and repeat have been adjusted accordingly.
+* --bench_kind [py, grid]


Should the options here be py and wrapper?

ciyongch · 2024-10-15T08:13:27Z

test/benchgc/src/benchgc/tuner/tuner.py

+            self.tunning_space.initial_ir,
+        )
+
+    def run(self, max_iter: int = DEFAULT_MAX_ITERS, timeout: int = DEFAULT_TIMEOUT):


Can we support constructing the modules in parallel and then executing them one by one in sequence, to reduce the compilation time?
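
The two-phase idea could look roughly like the sketch below. `build_module`, `run_module`, and `tune_batch` are hypothetical stand-ins, not functions from the PR; the real build step would construct and compile an MLIR module, and the real run step would time its execution.

```python
from concurrent.futures import ThreadPoolExecutor


def build_module(config):
    # Stand-in for constructing and compiling one module for a config.
    return {"config": config, "compiled": True}


def run_module(module):
    # Stand-in for a timed execution; returns a fake cost.
    return float(module["config"]["M_block"])


def tune_batch(configs, max_workers=4):
    # Phase 1: build all modules concurrently. map() preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        modules = list(pool.map(build_module, configs))
    # Phase 2: execute one at a time so the timings do not interfere.
    return [run_module(module) for module in modules]


configs = [{"M_block": block} for block in (16, 32, 64)]
costs = tune_batch(configs)
```

Threads only shorten the build phase if the underlying compilation releases the GIL, which is an assumption here. A process pool is the usual alternative, but it would require the built modules to be picklable.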

xurui1995 added 2 commits September 23, 2024 00:48

add tuner

6f4645f

update readme

f9e1fb0

xurui1995 added ready to review validation labels Sep 24, 2024

xurui1995 self-assigned this Sep 24, 2024

xurui1995 linked an issue Sep 24, 2024 that may be closed by this pull request

add tuner mode for BenchGC to support auto-tuning #330

Open

xurui1995 added 3 commits September 24, 2024 00:13

fix style

6cdcfeb

update readme

b545f67

fix style

8f27020

WangJialei-A reviewed Sep 24, 2024

xurui1995 added 3 commits September 24, 2024 15:27

Merge branch 'main' into xurui/benchgc_tuner

faf3e76

rm verbose in correctness.sh

1f237f9

Merge branch 'xurui/benchgc_tuner' of https://github.com/xurui1995/gr…

60d3443

…aph-compiler into xurui/benchgc_tuner

xurui1995 linked an issue Sep 24, 2024 that may be closed by this pull request

No verbose on correctness check script #352

Open

WangJialei-A reviewed Sep 24, 2024

xurui1995 requested a review from zhczhong September 24, 2024 07:37

fix

662cb3b

yifeizh2 reviewed Sep 30, 2024

xurui1995 added 4 commits October 7, 2024 19:43

fix config field name

7635044

Merge branch 'main' into xurui/benchgc_tuner

a55dbe7

add check matmul config from binding

07b698c

fix

d7e952e

yifeizh2 reviewed Oct 14, 2024

xurui1995 added 2 commits October 15, 2024 09:34

fix name

f64715e

fix order of m n k

afa0b41

yifeizh2 reviewed Oct 15, 2024

ciyongch reviewed Oct 15, 2024

xurui1995 added 2 commits October 17, 2024 10:12

fix attr name

085ef35

FIX

07a94f0
